H-Metric: Characterizing Image Datasets via Homogenization Based on KNN-Queries

Authors

  • Welington M. da Silva
  • José Fernando Rodrigues
  • Agma J. M. Traina
  • Sérgio Francisco da Silva
Abstract

Precision-Recall is one of the main metrics for evaluating content-based image retrieval techniques. However, it does not provide a broad view of the properties of an image dataset immersed in a metric space. In this work, we describe an alternative metric named H-Metric, which is determined along a sequence of controlled modifications of the image dataset. The process is named homogenization and works by altering the homogeneity characteristics of the classes of images in the dataset. The result is a process that measures how hard it is to deal with a set of images with respect to content-based retrieval, offering support in the task of analyzing configurations of distance function and feature extractor.

Similar resources

An Effective Approach for Robust Metric Learning in the Presence of Label Noise

Many algorithms in machine learning, pattern recognition, and data mining are based on a similarity/distance measure. For example, the kNN classifier and clustering algorithms such as k-means require a similarity/distance function. Also, in Content-Based Information Retrieval (CBIR) systems, we need to rank the retrieved objects based on the similarity to the query. As generic measures such as ...

Assessment of the Log-Euclidean Metric Performance in Diffusion Tensor Image Segmentation

Introduction: Appropriate definition of the distance measure between diffusion tensors has a deep impact on Diffusion Tensor Image (DTI) segmentation results. The geodesic metric is the best distance measure since it yields high-quality segmentation results. However, the important problem with the geodesic metric is a high computational cost of the algorithms based on it. The main goal of this ...

Minimizing the Number of Keypoint Matching Queries for Object Retrieval

To increase the efficiency of interest-point based object retrieval, researchers have put remarkable research efforts into improving the efficiency of kNN-based feature matching, pursuing to match thousands of features against a database within fractions of a second. However, due to the high-dimensional nature of image features that reduces the effectivity of index structures (curse of dimensio...

Efficiently Supporting Edit Distance Based String Similarity Search Using B$^+$-Trees

Edit distance is widely used for measuring the similarity between two strings. As a primitive operation, edit distance based string similarity search is to find strings in a collection that are similar to a given query string using edit distance. Existing approaches for answering such string similarity queries follow the filter-and-verify framework by using various indexes. Typically, most appr...

Improving K-nearest-neighborhood based Collaborative Filtering via Similarity Support

Collaborative Filtering (CF) is the most popular choice when implementing personalized recommender systems. A classical approach to CF is based on the K-nearest-neighborhood (KNN) model, where the precondition for making recommendations is the KNN construction for involved entities. However, when building KNN sets, there exists the dilemma of deciding the value of K --a small value will lead to poor r...

Journal title:
  • Data Science Journal

Volume 10  Issue 

Pages  -

Publication date 2011